
Keyword: SHAP Analysis
Methodological Paper
Epidemiology and Health Data Insights, 1(4), 2025, ehdi015, https://doi.org/10.63946/ehdi/17159
ABSTRACT:
Chronic diseases remain a leading cause of global mortality, underscoring the need for reliable mortality prediction models to guide individualized treatment and optimize resource allocation. This methodological note presents a reproducible framework for predicting one-year mortality in chronic disease patients using large-scale administrative healthcare data. The approach employs a retrospective cohort design, year-specific subcohorts, and stratified 5-fold cross-validation across a broad range of machine learning models. Performance is assessed with multiple metrics, including AUC, sensitivity, specificity, and balanced accuracy, to account for class imbalance. Model interpretability is enhanced through SHapley Additive exPlanations (SHAP), which identify key mortality predictors and the direction of their effect. The proposed framework is general and can be applied to different chronic diseases. It has already been demonstrated in nationwide cohorts of patients with diabetes mellitus and chronic viral hepatitis in Kazakhstan, achieving AUC values of 0.74–0.80, comparable to international benchmarks despite relying on administrative data alone. The method is scalable and adaptable, allowing integration of laboratory and clinical data, with feature selection to address high-dimensionality challenges. Its generalizability and clinical relevance, however, should be validated in practice using enriched datasets across additional chronic diseases and diverse populations.
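The evaluation scheme named in the abstract (stratified folds plus imbalance-aware metrics) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the function names `stratified_folds`, `confusion_metrics`, and `auc` are hypothetical, the labels and scores are made-up toy values, and binary labels coded 0/1 with both classes present are assumed.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share of each class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and balanced accuracy for 0/1 labels.

    Assumes at least one positive and one negative in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in sample size, so for illustration only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 10% positive class, as a stand-in for a rare outcome.
labels = [1] * 10 + [0] * 90
folds = stratified_folds(labels, k=5)
print([sum(labels[i] for i in fold) for fold in folds])  # [2, 2, 2, 2, 2]

y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(confusion_metrics(y_true, y_pred)["balanced_accuracy"])  # 0.6875
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

In the toy prediction above, plain accuracy would be 0.8 even though half of the positive cases are missed; balanced accuracy (0.6875) exposes that gap, which is the reason for reporting it alongside AUC when deaths are rare relative to survivors.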